Using semi-supervised classifiers for credit scoring

Authors

  • Kenneth Kennedy
  • Brian Mac Namee
  • Sarah Jane Delany
Abstract

In credit scoring, low-default portfolios are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord), and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitability of semi-supervised one-class classification algorithms as a solution to the low-default portfolio problem is evaluated. The performance of one-class classification algorithms is compared with the performance of supervised two-class classification algorithms. This study also investigates the suitability of oversampling, which is a common approach to dealing with low-default portfolios. Assessment of the performance of one- and two-class classification algorithms using nine real-world banking data sets, which have been modified to replicate low-default portfolios, is provided. Our results demonstrate that only in the near or complete absence of defaulters should semi-supervised one-class classification algorithms be used instead of supervised two-class classification algorithms. Furthermore, we demonstrate for data sets whose class labels are unevenly distributed that optimising the threshold value on classifier output yields, in many cases, an improvement in classification performance. Finally, our results suggest that oversampling produces no overall improvement to the best performing two-class classification algorithms.
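The threshold optimisation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the toy scores, and the choice of the harmonic mean of sensitivity and specificity as the quantity to maximise are all assumptions made for illustration. The idea is that, with few defaulters, the conventional cut-off of 0.5 on classifier output may flag almost no one, so a cut-off chosen on held-out data can perform better.

```python
from typing import List, Tuple


def harmonic_mean_score(labels: List[int], scores: List[float], threshold: float) -> float:
    """Harmonic mean of sensitivity and specificity.

    A customer is flagged as a defaulter (label 1) when score >= threshold.
    The metric itself is an assumption chosen for this illustration.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = sensitivity + specificity
    return 2 * sensitivity * specificity / total if total else 0.0


def best_threshold(labels: List[int], scores: List[float]) -> Tuple[float, float]:
    """Scan the default cut-off 0.5 and every observed score; keep the best one."""
    best_t = 0.5
    best_v = harmonic_mean_score(labels, scores, best_t)
    for t in sorted(set(scores)):
        v = harmonic_mean_score(labels, scores, t)
        if v > best_v:  # strict comparison keeps the lowest threshold on ties
            best_t, best_v = t, v
    return best_t, best_v


# Hypothetical, hand-made validation data: 8 non-defaulters, 2 defaulters.
LABELS = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
SCORES = [0.05, 0.10, 0.12, 0.15, 0.20, 0.22, 0.25, 0.40, 0.30, 0.45]

if __name__ == "__main__":
    print("score at 0.5:", harmonic_mean_score(LABELS, SCORES, 0.5))
    print("best (threshold, score):", best_threshold(LABELS, SCORES))
```

On this toy data no score reaches 0.5, so the default cut-off identifies no defaulters, while a lower cut-off separates the two classes far better. In practice the threshold should be chosen on data not used for the final evaluation.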

Similar resources

A new corporate credit scoring system using semi-supervised discriminant analysis

Corporate credit scoring is important for investors and banks in risk management. However, the high dimensional data available from public financial statements make credit analysis difficult. To address the problem, dimensionality reduction is a key step to enhance scoring accuracy. By using semi-supervised discriminant analysis (SSDA) and support vector machines (SVMs), this study develops a n...

Fuzzy Apriori Rule Extraction Using Multi-Objective Particle Swarm Optimization: The Case of Credit Scoring

Many methods have been introduced to solve the credit scoring problem, such as support vector machines, neural networks and rule-based classifiers. Rule bases are favoured in credit decision making because of their ability to explicitly distinguish between good and bad applicants. In this paper multi-objective particle swarm optimization is applied to optimize a fuzzy apriori rule base in credit scoring. ...

Social Media-Driven Credit Scoring: the Predictive Value of Social Structures

While emerging economies have seen an explosion of social network site (SNS) adoption, these countries lack sophisticated credit scoring systems or credit bureaus to predict the creditworthiness of individuals. In this paper, we propose an SNS-based credit scoring method for micro loans using large-scale observational data. We show empirical evidence that by incorporating social network metrics, we c...

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed using external credit bureau data and finally used for estimating PD in banks. There is also continuing interest among banks in using rule-based classifiers to b...

Journal:
  • JORS

Volume: 64

Publication date: 2013